Abstract. This article demonstrates how an understanding of the geometry of a
family of cost functions can be used to develop efficient numerical algorithms for 
real-time optimisation. Crucially, it is not the geometry of the individual functions 
which is studied, but the geometry of the family as a whole. In some respects, this 
challenges the conventional divide between convex and non-convex optimisation 
problems because none of the cost functions in a family need be convex in order 
for efficient numerical algorithms to exist for optimising in real-time any function 
belonging to the family. The title "Optimisation Geometry" comes by analogy from 
the study of the geometry of a family of probability distributions being called infor- 
mation geometry. 



Classical optimisation theory is concerned with developing algorithms that scale 
well with increasing problem size and is therefore well-suited to "one-time" optimi- 
sation tasks such as encountered in the planning and design phases of an engineering 
endeavour. Techniques from classical optimisation theory are often applied to "real- 
time" optimisation tasks in signal processing applications, yet real-time optimisation 
problems have their own exploitable characteristics. 

The often overlooked perspective this article brings to real-time optimisation prob- 
lems is that the family of cost functions should be studied as a whole. This leads to 
a nascent theory of real-time optimisation that explores the theoretical and practical 
consequences of understanding the topology and geometry of how a collection of 
cost functions fit together. 

For the purposes of this article, real-time optimisation is the challenge of developing
a numerical algorithm taking a parameter value $\theta \in \Theta$ as input, and returning
relatively quickly a suitable approximation to an element of

$$\operatorname*{arg\,min}_{x \in X} f(x;\theta) \qquad (1)$$

where the parametrised cost functions $f(\cdot\,;\theta)$ are known in advance. Since
combinatorial and other non-smooth optimisation problems are less amenable to the methods
introduced in this article, for the moment it may be assumed that $X$ and $\Theta$ are
differentiable manifolds and $f : X \times \Theta \to \mathbb{R}$ is a smooth function. (An
important generalisation involving smooth fibre bundles will be introduced in Section 2.)

An example of real-time optimisation in signal processing is maximum-likelihood
estimation, where $x$ is the parameter to be estimated from the observation $\theta$ and
$f(x;\theta)$ is the negative logarithm of the statistical likelihood function. In a
communications system, if the transmitted message is $x$ and the received packet is
$\theta$ then each time a new packet is received the optimisation problem (1) must be
solved to recover $x$ from $\theta$.

The distinguishing features setting apart real-time optimisation from classical
optimisation are: the class of cost functions $f(\cdot\,;\theta)$ is known in advance; the
class is relatively small (meaning $\Theta$ is finite-dimensional); an autonomous algorithm
is required that quickly and efficiently optimises $f(\cdot\,;\theta)$ for (almost) any
value of $\theta$.

Real-time optimisation problems also differ from adaptive problems in that global 
robustness is important. Real-time algorithms must be capable of handling in turn 
any sequence of values for the parameter $\theta$, whereas adaptive algorithms can assume
successive values of $\theta$ will be close to each other, thereby simplifying the problem
to that of tracking perturbations. Nevertheless, there are similarities because it is 
proposed here, in essence, to solve real-time optimisation problems by reducing 
them to tracking problems. Geometry facilitates this reduction. 

The recent popularity of convex optimisation methods in signal processing exem- 
plifies the earlier remark that classical optimisation theory is often applied to real- 
time optimisation problems. While great benefit has come from the realisation that 
classes of signal processing problems can be converted into convex optimisation 
problems such as Second-Order Cone Programming problems, this approach does 
not exploit the relationships between the different cost functions in the same family. 

Although convexity currently determines the dichotomy of optimisation (convex
problems are "easy" and non-convex problems are "hard" [12]), this is irrelevant
for real-time optimisation because the complexity of real-time algorithms can be
reduced by using the results of offline computations made during the design phase.
An extreme example is when all the cost functions $f(\cdot\,;\theta)$ are just translated
versions of a cost function $h(\cdot)$, such as $f(x;\theta) = h(x-\theta)$. The cost
function $h$ might be difficult to optimise, but once its minimiser $x^*$ has been found,
the real-time optimisation algorithm itself is trivial: given $\theta$, the minimiser of
$f(x;\theta) = h(x-\theta)$ is immediately computed to be $x^* + \theta$.
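
The translated family can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `h`, the search interval and the grid size are hypothetical choices, and a brute-force grid search stands in for whatever offline method would actually be used.

```python
import math

def h(x):
    # Hypothetical non-convex cost with several local minima (illustrative only).
    return (x - 1.0) ** 2 + 2.0 * math.sin(5.0 * x)

def offline_minimiser(lo=-4.0, hi=6.0, samples=200001):
    # Expensive design-phase step: brute-force grid search for the minimiser of h.
    step = (hi - lo) / (samples - 1)
    return min((lo + j * step for j in range(samples)), key=h)

X_STAR = offline_minimiser()  # computed once, before any theta is seen

def realtime_minimiser(theta):
    # f(x; theta) = h(x - theta) is minimised at x* + theta: one addition online.
    return X_STAR + theta
```

All of the difficulty sits in `offline_minimiser`; the online cost is independent of how hard `h` is to optimise.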

This line of reasoning extends to more general situations. For concreteness, take the
parameter space $\Theta$ to be the circle $S^1$ (or, in fact, any compact manifold). As
before, each individual cost function $f(\cdot\,;\theta)$ might be difficult to optimise,
but provided the location of the minimum varies smoothly for almost every value of
$\theta$, the following (simplified) algorithm presents itself. Choose a finite number of
parameter values $\theta_1, \dots, \theta_n \in \Theta$. Using whatever means possible,
compute beforehand the minima $x_1^*, \dots, x_n^* \in X$ of the cost functions
$f(x;\theta_i)$, that is, $f(x_i^*;\theta_i) = \min_x f(x;\theta_i)$. The minimum of
$f(\cdot\,;\theta)$ generally can be found quickly and reliably by determining the
$\theta_i$ closest to $\theta$ and, starting with the pair $(x_i^*, \theta_i)$, applying a
homotopy method [ ] to find the minimum of successive cost functions
$f(\cdot\,;\theta_i + k\epsilon)$ for $k = 1, \dots, K$, where
$\epsilon = (\theta - \theta_i)/K$; see Section 5 for details. Thus, the overall
complexity of real-time optimisation is determined by how the cost functions
$f(\cdot\,;\theta)$ change as $\theta$ is varied, and not by any classical measure of the
difficulty of optimising a particular cost function in the family
$\{f(\cdot\,;\theta) \mid \theta \in \Theta\}$.
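
The simplified algorithm above can be sketched as follows. The cost family `f`, the table size `n`, the number of homotopy steps `K` and the use of a few Newton corrections per step are all assumptions made for illustration; the article does not prescribe them. Each cost function here has a spurious local minimum on every fibre, so it is not convex in any sense.

```python
import math

def f(x, theta):
    # Hypothetical family on S^1 x S^1; global minimiser at x = sin(theta) mod 2*pi,
    # plus a second (non-global) local minimum on every fibre.
    u = x - math.sin(theta)
    return -math.cos(u) - 0.3 * math.cos(2 * u)

def df(x, theta):
    u = x - math.sin(theta)
    return math.sin(u) + 0.6 * math.sin(2 * u)

def d2f(x, theta):
    u = x - math.sin(theta)
    return math.cos(u) + 1.2 * math.cos(2 * u)

def newton(x, theta, iters):
    # Newton iteration on the fibre over theta (local corrector).
    for _ in range(iters):
        x -= df(x, theta) / d2f(x, theta)
    return x

def wrap(a):
    # Map an angle difference to the interval [-pi, pi].
    return math.atan2(math.sin(a), math.cos(a))

def offline_table(n=8, grid=2000):
    # Design phase: brute-force the global minimiser at theta_1, ..., theta_n.
    table = []
    for i in range(n):
        theta_i = 2 * math.pi * i / n
        x0 = min((2 * math.pi * j / grid for j in range(grid)),
                 key=lambda x: f(x, theta_i))
        table.append((theta_i, newton(x0, theta_i, iters=5)))
    return table

def online_minimiser(theta, table, K=10):
    # Start from the closest tabulated pair and step theta_i + k*eps, k = 1..K.
    theta_i, x = min(table, key=lambda e: abs(wrap(theta - e[0])))
    eps = wrap(theta - theta_i) / K
    for k in range(1, K + 1):
        x = newton(x, theta_i + k * eps, iters=2)
    return x % (2 * math.pi)
```

The online routine never performs a global search: it only corrects a known minimiser as the parameter moves in small steps, which is the sense in which the problem is reduced to tracking.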

Another reason for believing in advance that the geometry of the family of cost 
functions as a whole will help determine the computational complexity of real-time
optimisation is that work on topological complexity and real complexity theory has 
already demonstrated that the geometry of a problem provides vital clues for its 
numerical solution [14, 13, 4]. (Another example of the efficacy of using geometry 
to develop numerical solutions is [3].) 

Shifting from a Euclidean-based perspective of optimisation to a manifold-based 
perspective is expected to facilitate the development of a complexity theory for 
real-time optimisation. Moving to a differential geometric setting accentuates the 
geometric aspects while attenuating artifacts introduced by specific choices of coor- 
dinate systems used to describe an optimisation problem [8, 10, 7]. Furthermore, a 
wealth of problems occur naturally on manifolds [5, 8, 9], and coaxing them into a 
Euclidean framework is artificial and not necessarily beneficial.

The flat, unbounded geometry of Euclidean space places no topological restrictions
on cost functions $f : \mathbb{R}^n \to \mathbb{R}$. Focusing on compact manifolds creates a richer
structure for algorithms to exploit while maintaining practical relevance: compact 
Lie groups, and Grassmann and Stiefel manifolds occur in a range of signal process- 
ing applications. To the extent that no algorithm can search an unbounded region in 
finite time, the restriction to compact manifolds is not necessarily that restrictive. As 
a first step then, this article focuses on optimisation problems on compact manifolds.

One way to visualise how the cost functions in a family fit together is to imagine
the mapping $\theta \mapsto f(\cdot\,;\theta)$ carving out a subset of the space of all
(smooth) functions. This is essentially the approach taken in information geometry [2],
where $f(\cdot\,;\theta)$ is a probability density function rather than a cost function.
It seems appropriate then to use Optimisation Geometry as the title of this article.

Tangentially, it is remarked that even for one-time optimisation problems, it is not 
clear to the author that convexity is the fundamental divide separating easy from 
hard problems. Convexity might be an artifact of focusing on optimisation problems
on $\mathbb{R}^n$ rather than on compact manifolds. There do not exist any nontrivial
convex functions $f : M \to \mathbb{R}$ on a compact connected manifold $M$ (if $f$ is
convex [15] then it is necessarily a constant), yet if $M$ were a circle or a sphere,
presumably there are numerous classes of cost functions that can be "easily" optimised.

2 A Fibre Bundle Formulation of Optimisation 

A real-time optimisation algorithm computes a possibly discontinuous mapping $g$
from $\Theta$ to $X$. Given an input $\theta \in \Theta$, the algorithm returns
$g(\theta) \in X$ where $g$ satisfies

$$f(g(\theta);\theta) = \min_{x \in X} f(x;\theta) \qquad (2)$$

for all, or almost all, $\theta \in \Theta$. (Randomised algorithms are not considered
here.) In a certain sense then, the additional information contained in the cost functions
$f(\cdot\,;\theta)$ is irrelevant; if a closed-form expression for $g$ can be determined
then the original functions $f$ can be discarded.

However, often in practice it is too hard (or not worth the effort) to find $g$ explicitly.
Optimisation algorithms therefore typically make use of the cost function, finding
the minimum by moving downhill, for example. With the caveat that there is no need
to remain with the original cost functions $f(\cdot\,;\theta)$ (they can be replaced by
any other family provided there is no consequential change to the "optimising function"
$g$), a first attempt at studying the complexity of real-time optimisation problems
can be made by endeavouring to link the geometry of $f$ with the computational
complexity of evaluating the optimising function $g$.

Define $M$ to be the product manifold $M = X \times \Theta$, and let $\pi : M \to \Theta$
denote the projection $(x,\theta) \mapsto \theta$. The family of cost functions
$f(\cdot\,;\theta)$ can be thought of as a single function $f : M \to \mathbb{R}$, that is,
as a scalar field on $M$. Provided $f : M \to \mathbb{R}$ is smooth, the manifold $M$
relates to how the cost functions fit together.

If $f$ is not smooth, a reparametrisation of the family of cost functions could be sought
to make it smooth; in essence, a parametrisation $\theta \mapsto f(\cdot\,;\theta)$ is
required for which smooth perturbations of $\theta$ result in smooth perturbations of the
corresponding cost
functions. To increase the chance of this being possible, an obvious and notationally 
convenient generalisation of the real-time optimisation problem is introduced. 

Definition 1 (Fibre Bundle Optimisation Problem). Let $M$ be a smooth fibre bundle
over the base space $\Theta$ with typical fibre $X$ and canonical projection
$\pi : M \to \Theta$. Let $f : M \to \mathbb{R}$ be a smooth function. The fibre bundle
optimisation problem is to devise an algorithm computing an optimising function
$g : \Theta \to M$ that satisfies $(\pi \circ g)(\theta) = \theta$ and
$(f \circ g)(\theta) = \min_{p \in \pi^{-1}(\theta)} f(p)$ for all $\theta \in \Theta$.

Standing Assumptions: For mathematical simplicity, it is assumed throughout that
$M$, $\Theta$ and $X$ in Definition 1 are compact. Smoothness means $C^\infty$-smoothness.

If $M = X \times \Theta$ then the only difference from before is that the output of the
algorithm is now a tuple $(x^*,\theta) \in M$ rather than merely $x^* \in X$. Allowing $M$
to be a non-trivial bundle is useful in practice, as now demonstrated.

Example 2. Let $M$ and $\Theta$ be compact connected manifolds. If $\pi : M \to \Theta$ is
a submersion then it is necessarily surjective and makes $M$ a fibre bundle. Given
a smooth $f : M \to \mathbb{R}$, the fibre bundle optimisation problem is equivalent to the
constrained optimisation problem of minimising $f(p)$ subject to $\pi(p) = \theta$.

Example 3. Let $\mathrm{St}(k,n) = \{X \in \mathbb{R}^{n \times k} \mid X^T X = I\}$ denote
a Stiefel manifold and $O(k) = \{X \in \mathbb{R}^{k \times k} \mid X^T X = I\}$ an
orthogonal group. The Grassmann manifold $\mathrm{Gr}(k,n)$ is a quotient space of
$\mathrm{St}(k,n)$, and in particular, $\mathrm{St}(k,n)$ decomposes as a bundle
$\pi : \mathrm{St}(k,n) \to \mathrm{Gr}(k,n)$ with typical fibre $O(k)$. Given a smooth
function $f : \mathrm{St}(k,n) \to \mathbb{R}$, the corresponding fibre bundle optimisation
problem is to minimise $f(X)$ subject to the range-space of $X$ being fixed (that is,
that $\pi(X)$ is known). A related optimisation problem (involving a constraint on the
kernel rather than the range-space of $X$) occurs naturally in low-rank approximation
problems [11].

The optimisation problem in Example 3 can be written in parametrised form by
changing $f$ to $\tilde f : \mathrm{Gr}(k,n) \times O(k) \to \mathbb{R}$, but if $f$ is
smooth then $\tilde f$ need not be continuous. Fibre bundles allow for twists in the
global geometry.

Example 4. Another decomposition of $\mathrm{St}(k,n)$ is
$\pi : \mathrm{St}(k,n) \to S^{n-1}$ where $\pi(X)$ is the first column of $X$. This
corresponds to interpreting an element $X \in \mathrm{St}(k,n)$ as a point in the
$(k-1)$-dimensional orthogonal frame bundle of the $(n-1)$-dimensional sphere. More
generally, fibre bundle optimisation problems arise whenever a smooth function $f$ is
defined on a tangent bundle, sphere bundle, (orthogonal) frame bundle or normal bundle of
a manifold $M$, and it is required to optimise $f(p)$ subject to $p$ being constrained to
lie above a specified point on $M$.

Remark 5. Fibre bundle optimisation problems (Definition 1) decompose into
lower-dimensional fibre bundle optimisation problems. If $\Theta'$ is a submanifold of
$\Theta$ then the restriction of $\pi$ to $\pi^{-1}(\Theta')$ makes
$M \cap \pi^{-1}(\Theta')$ into a fibre bundle over $\Theta'$. Conversely, a fibre bundle
optimisation problem can be embedded in a higher-dimensional fibre bundle optimisation
problem.

The optimising function $g$ in Definition 1 would be a section if it were smooth, but
in general $g$ need not be everywhere continuous, much less smooth. This is handled
by imposing a niceness constraint on the optimisation problem. 

Definition 6 (Niceness). The fibre bundle optimisation problem in Definition 1 is
deemed to be nice if there exist a finite number of connected open sets
$\Theta_i \subset \Theta$ whose union is dense in $\Theta$, and there exist smooth local
sections $g_i : \Theta_i \to M$ such that
$(f \circ g_i)(\theta) = \min_{p \in \pi^{-1}(\theta)} f(p)$ whenever $\theta \in \Theta_i$.

The requirement that the $g_i$ are sections means $\pi(g_i(\theta)) = \theta$ for every
$\theta \in \Theta_i$. The smallest number of connected open sets required in Definition 6
can be considered to be the topological complexity of the optimisation problem by analogy
with the definition of topological complexity in [ ]; note though that the $g_i$ are
required to be smooth in Definition 6 whereas Smale required only continuity.

Remark 7. A more practical definition of niceness might require the $\Theta_i$ in
Definition 6 to be semialgebraic sets, perhaps even with a limit placed on the number
of function evaluations required to test if $\theta$ is in $\Theta_i$. This is not seen as
a major issue though because it is always possible to evaluate more than one of the $g_i$
at $\theta$ and choose the one which gives the lowest value of $f(g_i(\theta))$; the
algorithm for computing $g_i$ can return whatever it likes if $\theta \notin \Theta_i$. See
also Section 5.
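
The selection rule in Remark 7 can be sketched as follows. The cost function, the constant `A` and the two closed-form sections `g1` and `g2` are hypothetical and chosen so that the fibre-wise local minima are known exactly: one section is optimal where $\cos\theta < 0$ and the other where $\cos\theta > 0$.

```python
import math

A = 1.0  # hypothetical coupling strength (the example needs |A| < 4)

def f(x, theta):
    # Illustrative cost on the torus S^1 x S^1 (an assumption, not from the article).
    return math.cos(2.0 * x) + A * math.cos(theta) * math.sin(x)

# Two smooth local sections; each follows one circle of fibre-wise local minima.
def g1(theta):
    return (math.pi / 2.0, theta)

def g2(theta):
    return (3.0 * math.pi / 2.0, theta)

SECTIONS = [g1, g2]

def optimising_function(theta):
    # Evaluate every candidate section and keep the one with the lowest cost,
    # so no membership test for the sets Theta_i is needed.
    return min((g(theta) for g in SECTIONS), key=lambda p: f(p[0], p[1]))
```

The price of avoiding the membership test is one extra evaluation of $f$ per candidate section.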

Whereas Section 1 only required a real-time optimisation algorithm to compute the
correct answer for almost all values of $\theta$, the standing assumption of compactness
together with restricting attention to nice problems means the algorithm can be
required to work for all $\theta$; see Remarks 8 and 14.

Remark 8. The compactness of $M$ means that if $\theta_n \in \Theta_i$ and
$\theta_n \to \theta$ then $\{g_i(\theta_n)\}_{n=1}^{\infty}$ has at least one limit
point, call it $q$. Then $\pi(q) = \theta$ and $f(q) = \min_{p \in \pi^{-1}(\theta)} f(p)$.
Therefore, if a fibre bundle optimisation problem is nice (Definition 6) then an
optimising function exists on the whole of $\Theta$ (Definition 1).

Remark 9. In Definition 1, the geometry of the optimisation problem is encoded
jointly by $M$ and $f$. It is straightforward to reduce $f$ to a canonical form by
replacing $M$ with the graph $\Gamma = \{(p, f(p)) \in M \times \mathbb{R} \mid p \in M\}$.
Then $f$ becomes the height function $(x,y) \mapsto y$ and the geometry of the optimisation
problem is encoded in how $\Gamma$ sits inside $M \times \mathbb{R}$.

As a visual aid, it can be assumed, from Remark 9 and the Whitney embedding theorem,
that $M$ is embedded in Euclidean space and the level sets $f^{-1}(c)$ are horizontal
slices of $M$.

3 The Torus 

To motivate subsequent developments, this section primarily considers fibre bundle
optimisation problems on the product bundle $M = S^1 \times S^1$. The function
$f : S^1 \times S^1 \to \mathbb{R}$ can be thought of as defining the temperature at each
point of a torus. Definitions and results will be stated in generality though, for
arbitrary $M$.

3.1 Fibre-wise Morse Functions 

Minimising $f : S^1 \times S^1 \to \mathbb{R}$ restricted to a fibre is simply the problem
of minimising a real-valued function on a circle. The smoothness of $f$ and the
compactness of $S^1$ ensure the existence of at least one global minimum per fibre.

To give more structure to the set of critical points, it is common to restrict attention 
either to real-analytic functions or Morse functions. Optimisation of real-analytic 
functions will not be considered here, but may well prove profitable for the study of 
gradient-like algorithms for fibre bundle optimisation problems. 

If $h : S^1 \to \mathbb{R}$ is Morse, meaning all its critical points are non-degenerate, then its
critical points are isolated and hence finite in number. Furthermore, the Newton 
method for optimisation converges locally quadratically to non-degenerate critical 
points. These are desirable properties that will facilitate the development of optimi- 
sation algorithms in Sections 4 and 5. 

Definition 10 (Fibre-wise Morse Function). A fibre-wise critical point $p$ of the function
$f$ in Definition 1 is a critical point of $f|_{\pi^{-1}(\pi(p))}$, the restriction of $f$
to the fibre $\pi^{-1}(\pi(p))$ containing $p$. It is non-degenerate if the Hessian of
$f|_{\pi^{-1}(\pi(p))}$ at $p$ is non-singular. If all fibre-wise critical points of $f$
are non-degenerate then $f$ is a fibre-wise Morse function.

Remark 11. Note that $f$ being fibre-wise Morse differs from $f$ being Morse; a
non-degenerate fibre-wise critical point need not be a critical point of $f$, and even if
it were, it need not be non-degenerate as a critical point of $f$.

Lemma 12. Let $f : M \to \mathbb{R}$ be a fibre-wise Morse function (Definition 10) on the
bundle $\pi : M \to \Theta$ (Definition 1). The set $N$ of fibre-wise critical points is a
submanifold of $M$ with the same dimension as $\Theta$. It intersects each fibre
$\pi^{-1}(\theta)$ transversally.

Proof. It suffices to work locally; let $U \subset \Theta$ be open. Denote by $VM$ the
vertical bundle of $M$; it is a subbundle of the tangent bundle $TM$. Let
$s_1, \dots, s_k : \pi^{-1}(U) \to VM$ be a local basis, where
$k = \dim M - \dim \Theta$. (The $s_j$ are local smooth sections of $VM$ such that
$\{s_1(p), \dots, s_k(p)\}$ is a basis for $V_pM$ for every $p \in \pi^{-1}(U)$.) Define
$e : \pi^{-1}(U) \to \mathbb{R}^k$ by $e(p) = (df(s_1(p)), \dots, df(s_k(p)))$. Then the
set of fibre-wise critical points is given locally by $N \cap \pi^{-1}(U) = e^{-1}(0)$.
Fix $p \in N$. Since $f$ is fibre-wise Morse, $de_p$ restricted to $V_pM$ is non-singular.
Therefore $de_p$ is surjective and $\ker de_p + V_pM = T_pM$. Thus, $e^{-1}(0)$ is an
embedded submanifold of $M$, it has dimension $\dim M - k = \dim \Theta$, and it
intersects each fibre transversally. □

The situation is especially nice on the torus: Lemma 12 implies that the set N of 
fibre-wise critical points of a fibre-wise Morse function is a disjoint union of a fi- 
nite number of circles, with each circle winding its way around the torus the same 
number of times. Precisely, there is an integer $b$ such that for any $\theta$, each
connected component of $N$ intersects the fibre $\pi^{-1}(\theta) = S^1 \times \{\theta\}$
precisely $b$ times.

As soon as the fibre-wise critical points of $f$ are known at a single fibre
$\pi^{-1}(\theta)$, the fibre-wise critical points of $f$ at another fibre
$\pi^{-1}(\theta')$ can be determined by tracking each of the points in
$N \cap \pi^{-1}(\theta)$ as $\theta$ moves along a continuous path to $\theta'$. This is
referred to as following the circles in $N$ from one fibre to another.

Investing more effort beforehand can obviate the need to follow more than one circle. 
A lookup table can record the circle in $N$ on which the minimum lies based on which
region contains $\theta$. Proposition 13 formalises this. (In practice, there may be reasons
for deciding to track more than one circle; see Remark 7.) 

Proposition 13. If $f$ in Definition 1 is fibre-wise Morse (Definition 10) then the fibre
bundle optimisation problem is nice (Definition 6).

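Proof. Let $N$ be the set of fibre-wise critical points of $f$. For $\theta \in \Theta$,
$N \cap \pi^{-1}(\theta)$ is a finite set of points because $\pi^{-1}(\theta)$ is compact
and $N \pitchfork \pi^{-1}(\theta)$ with $\dim N + \dim \pi^{-1}(\theta) = \dim M$; see
Lemma 12. Therefore there exist an open neighbourhood $U_\theta \subset \Theta$ of
$\theta$ and local smooth sections
$s_1^{(\theta)}, \dots, s_{k_\theta}^{(\theta)} : U_\theta \to M$ such that
$N \cap \pi^{-1}(U_\theta) = \bigcup_{j=1}^{k_\theta} s_j^{(\theta)}(U_\theta)$;
pictorially, each section traces out a distinct component of $N \cap \pi^{-1}(U_\theta)$.
Let $V_\theta \subset \Theta$ be an open neighbourhood of $\theta$ whose closure
$\overline{V}_\theta$ is contained in $U_\theta$. By compactness there exist a finite
number of the $V_\theta$ which cover $\Theta$; denote these sets by $V_{\theta_i}$. Let
$J_{ij} = \{\theta \in \overline{V}_{\theta_i} \mid f(s_j^{(\theta_i)}(\theta)) = h(\theta)\}$
where $h(\theta) = \min_{p \in \pi^{-1}(\theta)} f(p)$. Each $J_{ij}$ is a closed subset of
$\overline{V}_{\theta_i}$ because $h$ is continuous. Furthermore,
$\bigcup_j J_{ij} = \overline{V}_{\theta_i}$ and hence $\bigcup_{ij} J_{ij} = \Theta$. Let
$\Theta_{ij}$ denote the interior of $J_{ij}$. Since $J_{ij} \setminus \Theta_{ij}$ is
nowhere dense, $\bigcup_{ij} \Theta_{ij}$ is dense in $\Theta$. The requirements of
Definition 6 are met with $g_{ij}(\theta) = s_j^{(\theta_i)}(\theta)$. □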

Remark 14. A stronger definition of niceness could have been adopted: each $g_i$ in
Definition 6 could have been required to be a smooth optimising function on
$\overline{\Theta}_i$, the closure of $\Theta_i$. Also, because there are only a finite
number of sets involved, $\bigcup_i \Theta_i$ is dense in $\Theta$ if and only if
$\bigcup_i \overline{\Theta}_i = \Theta$.

3.2 Connection with Morse Theory

It is natural to ask what role Morse theory plays in real-time optimisation. After all, 
Morse theory contributes to one-time optimisation problems by providing informa- 
tion about the number, type and to some extent the location of critical points. 

The short answer is that the connection between Morse theory and real-time optimisation
is more subtle than for one-time optimisation. The fibre bundle formulation
of real-time optimisation highlights that real-time optimisation is concerned with
constrained optimisation. It is not the level sets $\{p \in M \mid f(p) = c\}$ that are
important for real-time optimisation but how they intersect the fibres
$\pi^{-1}(\theta)$. From an algorithmic perspective, whereas one-time optimisation
algorithms are required to find (isolated) critical points, real-time optimisation
algorithms (at least from the viewpoint of this article) are required to track the
critical points from fibre to fibre.

Nevertheless, for completeness, this section recalls what classical Morse theory says 
about the torus. Let $f : M \to \mathbb{R}$ be a smooth Morse function on $M = S^1 \times S^1$ with
distinct critical points having distinct values. This is a mild assumption in practice 
because an arbitrarily small perturbation of / can always be found to enforce this. 

Morse theory explains how the level sets $f^{-1}(c)$ fit together to form $M$. The fibre
bundle optimisation problem is to find the smallest $c$ for which $f^{-1}(c)$ intersects
the submanifold $\pi^{-1}(\theta)$ for a given $\theta$.

If $p \in \pi^{-1}(\theta)$ is a local minimum of $f$ then it is also a local minimum of
$f|_{\pi^{-1}(\theta)}$, and similarly for a local maximum. In both cases, $p$ is an
isolated critical point of $f|_{\pi^{-1}(\theta)}$. This need not be true though if $p$ is
a saddle point of $f$.

Let $p_0, \dots, p_{n-1}$ denote the critical points of $f$ ordered so the values
$c_i = f(p_i)$ ascend. The genus of the torus dictates that the number of saddle points
equals the total number of local minima and maxima, therefore $n \ge 4$.

For $c \in [c_0, c_{n-1}]$ a regular value of $f$, $f^{-1}(c)$ is a compact
one-dimensional manifold and hence diffeomorphic to a finite number of circles. The number
of circles changes by one as $c$ passes through a critical value. In particular,
$f^{-1}(c_0)$ is a single point, $f^{-1}(c)$ for $c \in (c_0, c_1)$ is diffeomorphic to
$S^1$, and $f^{-1}(c_1)$ is either diffeomorphic to a circle plus a distinct point, or it
is diffeomorphic to two circles joined at a single point. In general, $f^{-1}(c_i)$ is
either diffeomorphic to zero or more copies of a circle plus a distinct point, or it is
diffeomorphic to zero or more copies of a circle plus two circles joined at a single
point. The former occurs when $p_i$ is a local extremum and the latter occurs when $p_i$
is a saddle point.

Not only is $f^{-1}(c)$ diffeomorphic to a finite number of circles for $c$ a regular
value, but $\pi^{-1}(\theta)$ is also diffeomorphic to a circle. Visually then, increasing
$c$ corresponds to sliding one or more rubber bands along the surface of the torus, and of
interest is when one of these rubber bands first hits the circle $\pi^{-1}(\theta)$. The
point of first contact is either a critical point of $f$ or a non-transversal intersection
point of $f^{-1}(c) \cap \pi^{-1}(\theta)$. Indeed, if
$p \in f^{-1}(c) \cap \pi^{-1}(\theta)$ is not a critical point of $f$ then $p$ is a
critical point of $f|_{\pi^{-1}(\theta)}$ if and only if $p$ is a non-transversal
intersection point of $f^{-1}(c)$ with $\pi^{-1}(\theta)$. This connects with Definition 10.

4 Newton's Method and Approximate Critical Points 

The Newton method is the archetypal iterative algorithm for function minimisation.
Whereas its global convergence properties are intricate (domains of attraction
can be fractal), the local convergence properties of the Newton method are well
understood. The advantage of real-time optimisation over one-time optimisation
is that it suffices to study local convergence properties of iterative algorithms because
suitable initial conditions can be calculated offline.

The concept of an approximate zero was introduced in [4]. An equivalent concept
will be used here, although subsequent developments differ. In [4], attention was
restricted to analytic functions and global constants were sought for use in one-time
algorithms (for solving polynomial equations), as opposed to the focus here on
real-time optimisation algorithms.

The Newton iteration for finding a critical point of $h : \mathbb{R}^n \to \mathbb{R}$ is
$x_{k+1} = x_k - [h''(x_k)]^{-1} h'(x_k)$. Its invariance to affine changes of coordinates
means it suffices to assume in this section that the critical point of interest is located
at the origin. The Euclidean norm and Euclidean inner product are used throughout for
$\mathbb{R}^n$.

Definition 15 (Approximate Critical Point). Let $h : \mathbb{R}^n \to \mathbb{R}$ be a
smooth function with a non-degenerate critical point at the origin: $Dh(0) = 0$ and
$D^2h(0)$ is non-singular. A point $x$ is an approximate critical point if, when started at
$x_0 = x$, the Newton iterates $x_k$ at least double in accuracy per iteration:
$\|x_{k+1}\| \le \tfrac12 \|x_k\|$.

Provided the critical point is non-degenerate, the set of approximate critical points 
contains a neighbourhood of the critical point. For the development of homotopy-based
algorithms in Section 5, it is desirable to have techniques for finding a $\rho > 0$
such that all points within $\rho$ of the critical point are approximate critical points. Two
techniques will be explored, starting with the one-dimensional case for simplicity. 

Example 16. Let $h(x) = x^2 + x^3$. Then $h'(x) = 2x + 3x^2$ and $h''(x) = 2 + 6x$. The
Newton iterate is $x \mapsto x - \frac{2x+3x^2}{2+6x} = \frac{3x^2}{2+6x}$. Graphing this
function shows that the largest interval $[-\rho,\rho]$ containing only approximate
critical points is constrained by the equation $\frac{3x^2}{2+6x} = -\frac{x}{2}$ for
$x < 0$. In particular, $\rho = \frac16 \approx 0.17$ is the best possible.
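
The claim in Example 16 can be checked numerically. The sketch below tests Definition 15 directly for $h(x) = x^2 + x^3$; the function names and the decision to check only a finite number of Newton steps are assumptions of this illustration.

```python
def dh(x):
    # h'(x) for h(x) = x^2 + x^3
    return 2.0 * x + 3.0 * x * x

def d2h(x):
    # h''(x)
    return 2.0 + 6.0 * x

def newton_step(x):
    return x - dh(x) / d2h(x)

def is_approximate_critical_point(x, iterations=10):
    # Definition 15 with the critical point at the origin: every Newton step
    # must at least halve the distance to 0. Only finitely many steps are checked.
    for _ in range(iterations):
        x_next = newton_step(x)
        if abs(x_next) > 0.5 * abs(x):
            return False
        x = x_next
    return True
```

Points slightly inside $[-\tfrac16, \tfrac16]$ pass the test, while a point such as $x = -0.17$ fails at the first step, in agreement with the binding constraint lying on the negative side.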

Explicit calculation as in Example 16 is generally not practical. It will be assumed
that on an interval $I$ containing the origin the first few derivatives of $h$ are bounded.
Since $h'(0) = 0$, a basic approximation for $h'(x)$ on $I$ is $h'(x) = x\,h''(\bar x)$ for
some $\bar x \in I$. It follows that if $h''(\bar x)/h''(x)$ is bounded between $\tfrac12$
and $\tfrac32$ for $x, \bar x \in I$ then all points in $I$ are approximate critical
points. Moreover, $h'''(x)$ can be used to bound the change in $h''(x)$. This makes
plausible the following lemma.

Lemma 17. Let $h : \mathbb{R} \to \mathbb{R}$ be a smooth function with $h'(0) = 0$ and
$h''(0) \ne 0$. Let $I$ be an interval containing the origin and
$\alpha = \sup_{x \in I} |h'''(x)|$. Let $\rho = \frac{|h''(0)|}{2\alpha}$. Then every
point in the interval $[-\rho,\rho] \cap I$ is an approximate critical point of $h$.

Proof. Follows from Proposition 22 upon noting that
$h''(x) - h''(y) = h'''(\bar x)(x - y)$ for some $\bar x$ lying between $x$ and $y$. □

Example 18. In Example 16, $h'''(x) = 6$. Applying Lemma 17 gives
$\rho = \frac16 \approx 0.17$, coincidentally agreeing with the best possible bound.

The second technique is to look at the derivative of the Newton map
$x \mapsto x - \frac{h'(x)}{h''(x)}$, which is $\frac{h'(x)\,h'''(x)}{[h''(x)]^2}$. Provided
the magnitude of this derivative does not exceed $\tfrac14$ then $x$ is an approximate
critical point.
Example 19. In Example 16, $\frac{h'(x)\,h'''(x)}{[h''(x)]^2} = \frac{6(2x+3x^2)}{(2+6x)^2}$.
Its magnitude does not exceed $\tfrac14$ provided $|x| \le \frac{3-\sqrt{6}}{9} \approx 0.06$.

The need for evaluating $\frac{h'(x)\,h'''(x)}{[h''(x)]^2}$ can be avoided by using bounds
on derivatives; an upper bound on $|h'''(x)|$ gives a lower bound, linear in $x$, on
$h''(x)$, and an upper bound, quadratic in $x$, on $h'(x)$. Nevertheless, the first
technique appears to be preferable, and will be the one considered further.

Lemma 20. Let $h : \mathbb{R}^n \to \mathbb{R}$ have a non-degenerate critical point at the
origin. Let $H_x \in \mathbb{R}^{n \times n}$, a symmetric matrix, denote its Hessian at
$x$, that is, $D^2h(x) \cdot (\xi,\zeta) = \langle H_x \xi, \zeta \rangle$. Let $\bar H_x$
denote the averaged Hessian $\bar H_x = \int_0^1 H_{tx}\,dt$. Then $x$ is an approximate
critical point if $\|H_x^{-1} \bar H_x - I\| \le \tfrac12$, where the norm is the operator
norm.

Proof. The gradient of $h$ at $x$ is $\int_0^1 H_{tx}\,x\,dt = \bar H_x x$. Therefore the
Newton map is $x \mapsto x - H_x^{-1} \bar H_x x$. If
$\|H_x^{-1} \bar H_x - I\| \le \tfrac12$ then
$\|x - H_x^{-1} \bar H_x x\| \le \tfrac12 \|x\|$, as claimed. □

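Lemma 21. With notation as in Lemma 20, if $\|H_x - H_0\| < \|H_0^{-1}\|^{-1}$ then

$$\|H_x^{-1} \bar H_x - I\| \le \frac{\|H_0^{-1}\|\,\|\bar H_x - H_x\|}{1 - \|H_x - H_0\|\,\|H_0^{-1}\|}.$$

Proof. Let $A = -(H_x - H_0) H_0^{-1}$. Then $\|A\| \le \|H_x - H_0\|\,\|H_0^{-1}\| < 1$.
Therefore $\|(I-A)^{-1}\| = \|I + A + A^2 + \cdots\| \le 1 + \|A\| + \|A\|^2 + \cdots =
(1 - \|A\|)^{-1}$. Moreover, since $H_x = (I-A) H_0$,
$\|H_x^{-1} \bar H_x - I\| = \|H_0^{-1} (I-A)^{-1} (\bar H_x - H_x)\| \le
\|H_0^{-1}\| (1 - \|A\|)^{-1} \|\bar H_x - H_x\|$. Finally, note
$(1 - \|A\|)^{-1} \le (1 - \|H_x - H_0\|\,\|H_0^{-1}\|)^{-1}$. □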

A bound on the third-order derivative yields a Lipschitz constant for the Hessian. 

Proposition 22. Define $h$ and $H_x$ as in Lemma 20. Let $I$ be a star-shaped region
about the origin. Let $\alpha \in \mathbb{R}$ be such that
$\|H_x - H_y\| \le \alpha \|x - y\|$ for $x, y \in I$. Let
$\rho = (2\alpha \|H_0^{-1}\|)^{-1}$. If $x \in I$ and $\|x\| \le \rho$ then $x$ is an
approximate critical point.

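Proof. First, $\|\bar H_x - H_x\| \le \int_0^1 \|H_{tx} - H_x\|\,dt \le
\alpha \|x\| \int_0^1 (1-t)\,dt = \frac{\alpha}{2}\|x\|$. Also,
$\|H_x - H_0\| \le \alpha \|x\| \le \tfrac12 \|H_0^{-1}\|^{-1}$. Lemma 21 implies
$\|H_x^{-1} \bar H_x - I\| \le \frac{\frac{\alpha}{2}\|x\|\,\|H_0^{-1}\|}{1 - \alpha\|x\|\,\|H_0^{-1}\|} \le \tfrac12$.
The result now follows from Lemma 20. □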
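
Proposition 22 can be exercised on a small two-dimensional example. Everything specific below is an assumption of this sketch: the function $h(x,y) = x^2 + 2y^2 + x^2 y$, the Lipschitz constant $\alpha = \sqrt{8}$ (obtained from a Frobenius-norm bound, so valid but not tight), and the choice to check a finite number of Newton steps from points on the circle of radius $\rho$.

```python
import math

def grad(x, y):
    # Gradient of the illustrative function h(x, y) = x^2 + 2 y^2 + x^2 y.
    return (2.0 * x + 2.0 * x * y, 4.0 * y + x * x)

def hessian(x, y):
    # Entries (a, b, c) of the symmetric Hessian [[a, b], [b, c]] at (x, y).
    return (2.0 + 2.0 * y, 2.0 * x, 4.0)

def newton_step(x, y):
    gx, gy = grad(x, y)
    a, b, c = hessian(x, y)
    det = a * c - b * b
    # Solve the 2x2 system H d = g by Cramer's rule, then step to (x, y) - d.
    dx = (c * gx - b * gy) / det
    dy = (a * gy - b * gx) / det
    return x - dx, y - dy

# H_0 = diag(2, 4), so ||H_0^{-1}|| = 1/2. The Hessian difference
# [[2 dy, 2 dx], [2 dx, 0]] has Frobenius norm at most sqrt(8) * ||(dx, dy)||,
# so alpha = sqrt(8) is a valid (not tight) Lipschitz constant.
ALPHA = math.sqrt(8.0)
H0_INV_NORM = 0.5
RHO = 1.0 / (2.0 * ALPHA * H0_INV_NORM)

def halves_every_step(x, y, iterations=5):
    # Finite check of Definition 15: each Newton step at least halves the norm.
    for _ in range(iterations):
        nx, ny = newton_step(x, y)
        if math.hypot(nx, ny) > 0.5 * math.hypot(x, y):
            return False
        x, y = nx, ny
    return True
```

With these choices $\rho \approx 0.35$. Because $\alpha$ is only an upper bound, the true region of approximate critical points may be larger than the ball of radius $\rho$.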



5 A Homotopy-based Algorithm for Optimisation 

This section outlines how a homotopy-based algorithm can solve fibre bundle opti- 
misation problems efficiently. 

Homotopy-based algorithms have a long history [ ]. Attention has mainly focused 
on one-time problems where little use can be made of results such as Proposition 22 
requiring the prior calculation of various bounds on derivatives and locations of crit- 
ical points. Time spent on prior calculations is better spent on solving the one-time 
problem directly. The reverse is true for real-time algorithms. The more calculations 
performed offline, the more efficient the real-time algorithm can be made, up until
when onboard memory becomes a limiting factor. 

Definition 6 may make it appear that nice optimisation problems are not necessarily
that nice if the sets $\Theta_i$ are complicated. However, it is always straightforward to
find fibre-wise critical points by path following. The worst that can happen if the
$\Theta_i$ are complicated is that the algorithm may need to follow more than one path
because it cannot be sure which path contains the sought-after global minimum.

Proposition 23. With notation as in Lemma 12, let $\gamma : [0,1] \to \Theta$ be a smooth
path. Let $p \in N \cap \pi^{-1}(\gamma(0))$. Then $\gamma$ lifts to a unique smooth path
$\tilde\gamma : [0,1] \to N$ such that $\tilde\gamma(0) = p$ and
$\pi(\tilde\gamma(t)) = \gamma(t)$ for $t \in [0,1]$.

Proof. This follows from Lemma 12 in the same way that Proposition 13 did. □

Corollary 24. With notation as in Lemma 12, the number of points in the set
$N \cap \pi^{-1}(\theta)$ is constant for all $\theta \in \Theta$.

Different paths with the same end points can have different lifts. Nevertheless, as the 
number of fibre-wise critical points is constant per fibre, as soon as the fibre-wise 
critical points on one fibre are known, the fibre-wise critical points on any other 
fibre can be found by following any path from one fibre to another. Furthermore, 
only paths containing local minima need be followed to find a global minimum. 

Proposition 25. With notation as in Lemma 12, let $p$ and $q$ lie on the same connected
component of $N$. Then $p$ is a fibre-wise local minimum if and only if $q$ is a fibre-wise
local minimum.

Proof. Fibre-wise, each critical point is assumed non-degenerate. Therefore, along 
a continuous path, the eigenvalues of the Hessian cannot change sign and the index 
is preserved. □ 
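
Corollary 24 and Proposition 25 can be checked numerically on a toy example. The sketch below assumes a hypothetical family on the torus, f(x; theta) = cos(2x) + 0.3 cos(x - theta), which does not appear in the article; the perturbation is small enough that every fibre-wise critical point is non-degenerate. It counts the fibre-wise critical points, and those that are local minima, by looking for sign changes of the fibre-wise derivative on a grid.

```python
import math

def df(x, theta):
    """Fibre-wise derivative of the hypothetical cost cos(2x) + 0.3 cos(x - theta)."""
    return -2.0 * math.sin(2.0 * x) - 0.3 * math.sin(x - theta)

def count_critical_points(theta, n=2000):
    """Count sign changes of df around the circle; return (critical points, local minima)."""
    step = 2.0 * math.pi / n
    # A half-step offset keeps grid points away from zeros at multiples of pi.
    vals = [df((k + 0.5) * step, theta) for k in range(n)]
    critical = 0
    minima = 0
    for k in range(n):
        a, b = vals[k], vals[(k + 1) % n]   # wrap around the circle
        if a * b < 0.0:
            critical += 1
            if a < 0.0 < b:                 # derivative changes from - to +: local minimum
                minima += 1
    return critical, minima

for theta in (0.0, 0.7, 1.9, 3.1, 4.4, 5.8):
    print(theta, count_critical_points(theta))
```

For this family, every fibre carries four critical points, two of which are local minima, in line with the constant count per fibre and the preservation of the index along each component.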

Referring to Proposition 25, define $N_{\min} \subset N$ to be the union of the connected
components of $N$ corresponding to fibre-wise local minima.

An outline of a homotopy-based algorithm for fibre bundle optimisation problems
can now be sketched. It will be refined presently. It relies on several lookup tables,
the first of which has entries $(\theta, \pi^{-1}(\theta) \cap N_{\min})$ for $\theta \in \{\theta_1, \dots, \theta_n\} \subset \Theta$. That is to
say, the sets of all local minima of $f$ restricted to the fibres over $\theta_1, \dots, \theta_n$ have been
determined in advance.

1. Given $\theta$ as input, determine an appropriate starting point $\theta_i$ from the finite set
$\{\theta_1, \dots, \theta_n\}$.

2. Determine an appropriate path $\gamma$ from $\theta_i$ to $\theta$.

3. Track each fibre-wise critical point $p \in \pi^{-1}(\theta_i) \cap N_{\min}$ along the path $\gamma$ (i.e.,
numerically compute the lift $\tilde\gamma$ defined in Proposition 23).

4. Evaluate the cost function $f$ at the fibre-wise local minima on the fibre $\pi^{-1}(\theta)$
to determine which are global minima. Return one or all of the global minima.
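
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the article's implementation: the family f(x; theta) = cos(2x) + 0.3 cos(x - theta) on the torus is hypothetical, the anchors are eight equally spaced parameter values, the path is the shortest arc in the parameter circle, and a fixed parameter step with two Newton corrections per step replaces the step sizes that could be derived from precomputed bounds.

```python
import math

TWO_PI = 2.0 * math.pi

# Hypothetical family on the torus: x is the fibre coordinate, theta the parameter.
def f(x, theta):
    return math.cos(2.0 * x) + 0.3 * math.cos(x - theta)

def df(x, theta):
    return -2.0 * math.sin(2.0 * x) - 0.3 * math.sin(x - theta)

def d2f(x, theta):
    return -4.0 * math.cos(2.0 * x) - 0.3 * math.cos(x - theta)

def newton(x, theta, iters):
    """Newton's method applied to the fibre-wise derivative."""
    for _ in range(iters):
        x -= df(x, theta) / d2f(x, theta)
    return x

def fibre_minima(theta, n=720):
    """Offline: find all local minima on one fibre by a grid scan plus Newton polish."""
    step = TWO_PI / n
    minima = []
    for k in range(n):
        a = (k + 0.5) * step
        if df(a, theta) < 0.0 < df(a + step, theta):
            minima.append(newton(a + 0.5 * step, theta, 20))
    return minima

def wrap(angle):
    """Map an angle difference to the interval [-pi, pi)."""
    return (angle + math.pi) % TWO_PI - math.pi

# Lookup table: anchor parameters and the local minima on their fibres.
ANCHORS = [TWO_PI * i / 8 for i in range(8)]
TABLE = [fibre_minima(t) for t in ANCHORS]

def solve(theta, dtheta=0.05, newton_steps=2):
    """Online: return a global minimiser of f(.; theta), reduced modulo 2*pi."""
    # Step 1: choose the nearest anchor.
    i = min(range(len(ANCHORS)), key=lambda j: abs(wrap(theta - ANCHORS[j])))
    # Step 2: shortest arc from the anchor to theta, cut into short segments.
    delta = wrap(theta - ANCHORS[i])
    m = max(1, math.ceil(abs(delta) / dtheta))
    # Step 3: track every stored local minimum along the path.
    points = list(TABLE[i])
    for k in range(1, m + 1):
        t = ANCHORS[i] + delta * k / m
        points = [newton(x, t, newton_steps) for x in points]
    points = [newton(x, theta, 5) for x in points]   # final polish on the target fibre
    # Step 4: compare the cost at the tracked minima.
    return min(points, key=lambda x: f(x, theta)) % TWO_PI

print(solve(1.0), solve(4.0))
```

Only the table construction scans a whole fibre; the online routine `solve` performs a bounded number of Newton corrections, which is the point of moving work offline.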
Step 3 can be accomplished with a standard path-following scheme [1]. A refinement
is to utilise Proposition 22, as now explained. Using a suitably chosen local
coordinate chart, the cost function $f$ restricted to a sufficiently small segment of the
path $\gamma$ can be represented locally by a function $h\colon \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$. Here, $h$ should be
thought of as a parametrised cost function, with $h(\cdot\,; 0)$ the starting function having a
non-degenerate critical point at the origin, and the objective being to track that critical
point all the way to the cost function $h(\cdot\,; 1)$. An a priori bound on the location
of the critical point of $h(\cdot\,; t)$ is readily available; see for example [6, Chapter 16].
Similarly, Proposition 22 gives a bound on how far away from the critical point the
initial point can be whilst ensuring the Newton method converges rapidly. Therefore,
these two bounds enable the determination of the largest value of $t \in [0,1]$ such
that, starting at the origin, the Newton method is guaranteed to converge rapidly to
the critical point of $h(\cdot\,; t)$. Once that critical point has been found, a new local chart
can be chosen and the process repeated.

These same bounds, which are pre-computed and stored in lookup tables, permit the 
determination of the number of Newton steps required to get sufficiently close to 
the critical point. For intermediate points along the path, it is not necessary for the 
critical points to be found accurately. Provided the algorithm stays within the bound
determined by Proposition 22, the correct path is guaranteed to be followed.
The fact that M may be a manifold presents no conceptual difficulty. As in [8],
it suffices to work in local coordinates, and change charts as necessary, as
mentioned earlier.
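
To illustrate how a precomputed bound translates into a Newton step count, the sketch below assumes the quadratic contraction used in Smale's definition of an approximate zero, namely that the distance to the critical point after k steps is at most (1/2)^(2^k - 1) times the initial distance. Whether this matches the precise definition of an approximate critical point used in Lemma 20 is an assumption, and the function name is illustrative only.

```python
def newton_steps_needed(r0, eps):
    """Smallest k such that r0 * (1/2)**(2**k - 1) <= eps.

    r0  : bound on the initial distance to the critical point (e.g. a stored radius)
    eps : required accuracy
    Assumes the contraction |x_k - x*| <= (1/2)**(2**k - 1) * |x_0 - x*|.
    """
    if r0 <= 0.0 or eps <= 0.0:
        raise ValueError("r0 and eps must be positive")
    k = 0
    while r0 * 0.5 ** (2 ** k - 1) > eps:
        k += 1
    return k

# Entries of this kind could be tabulated offline for the tolerances of interest.
for eps in (1e-3, 1e-6, 1e-12):
    print(eps, newton_steps_needed(0.5, eps))
```

Because the step count grows only doubly-logarithmically in the accuracy, a small table indexed by tolerance would suffice under this assumption.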

Steps 1 and 2 of the algorithm pose three questions. How should the set $\{\theta_1, \dots, \theta_n\}$
be chosen, how should a particular $\theta_i$ be selected based on $\theta$, and what path should
be chosen for moving from $\theta_i$ to $\theta$? Importantly, the algorithm will work regardless
of what choices are made. Nevertheless, expedient choices can significantly enhance
the efficiency of the algorithm.

Another refinement is to limit in Step 3 the number of paths that are followed.
Proposition 13 ensures that it is theoretically possible to determine beforehand which path
the global minimum will lie on. Therefore, with the use of another lookup table, the
number of paths the algorithm must track can be reduced; see Remark 7.

6 Conclusion 

A nascent theory of optimisation geometry was propounded for studying real-time
optimisation problems. It was demonstrated that irrespective of how difficult an
individual cost function might be to optimise offline, a simple and reliable
homotopy-based algorithm can be used for the real-time implementation.

Real-time optimisation problems were reformulated as fibre bundle optimisation
problems (Definition 1). The geometry inherent in this fibre bundle formulation
provides information about the problem's intrinsic computational complexity. An
advantage of studying the geometry is that it prevents any particular choice of
coordinates from dominating, so there is a possibility of seeing through obfuscations
caused by the chosen formulation of the problem.

That geometry helps reveal the true complexity of an optimisation problem can be 
demonstrated by referring back to the discussion of the fibre bundle optimisation 
problem on the torus in Section 3. Irrespective of how complicated the individual
cost functions are (but with the proviso that they be fibre-wise Morse), the fibre-wise 
critical points will lie on a finite number of circles that wind around the torus, and 
because these circles cannot cross each other, or become tangent to a fibre, they each 
wind around the torus the same number of times. Therefore, in terms of where the 
fibre-wise critical points lie, the intrinsic complexity is encoded by just two integers: 
the number of circles, and the number of times each circle intersects a fibre. 

Although this article lacked the opportunity to explore this aspect, a crucial
observation is that even though it may appear that some problems are more complicated
than others because the paths of fibre-wise critical points locally "fluctuate" more, a
smooth transformation can be applied to iron out these fluctuations. Smooth
transformations cannot change the intrinsic complexity whereas they can, by definition,
eliminate extrinsic complexity.

The second determining aspect of complexity is the number of times the fibre-wise 
minimum jumps from one circle to another. This is precisely what is counted by the 
topological complexity, mentioned just after Definition 6. 

For higher dimensional problems, attention can always be restricted to compact
one-dimensional submanifolds of the parameter space $\Theta$, in which case the situation is
essentially the same as for the torus; see Remark 5. The only difference is that the
circles may become intertwined. The theory of links and braids may play a role in
further investigations, for if two circles are linked then no smooth transformation
can separate them.

Another potentially interesting direction for further work is to explore the possibility
of replacing a family of cost functions with an equivalent family which is
computationally simpler to work with but which gives the same answer.

There are myriad other opportunities for refinements and extensions. The theory 
presented in this article was the first that came to mind and may well be far from 
optimal. 